Piwi-interacting RNAs (piRNAs)
ensure transposable element silenc- 
ing in Drosophila, thereby preserving 
genome integrity across generations. 
Primary piRNAs arise from the process- 
ing of long RNA transcripts produced in 
the germ line by a limited number of telo- 
meric and pericentromeric loci. Primary 
piRNAs bound to the Argonaute protein 
Aubergine then drive the production of 
secondary piRNAs through the "ping- 
pong" amplification mechanism that 
involves an interplay with piRNAs bound 
to the Argonaute protein Argonaute-3. 
We recently discovered that clusters of 
P-element-derived transgenes produce 
piRNAs and mediate silencing of homol- 
ogous target transgenes in the female 
germ line. We also demonstrated that 
some clusters are able to convert other 
homologous inactive transgene clusters 
into piRNA-producing loci, which then 
transmit their acquired silencing capac- 
ity over generations. This paramutation 
phenomenon is mediated by maternal 
inheritance of piRNAs homologous to 
the transgenes. Here we further mined 
our piRNA sequencing data sets gener- 
ated from various strains carrying trans- 
genes with partial sequence homology 
at distinct genomic sites. This analysis 
revealed that the same sequences in different
genomic contexts generate highly
similar profiles of piRNA abundances.
The strong tendency of piRNAs to bear
a U at their 5' end has long been
recognized. Our observations support
the notion that, in addition, the relative
frequencies of Drosophila piRNAs are 
locally determined by the DNA sequence 
of piRNA loci. 

Repression of Transposable Elements 
(TEs) in the Drosophila germline by Piwi- 
interacting RNAs (piRNAs) preserves 
genome integrity and prevents the transmission
of mutations induced by TE mobilization
to subsequent generations. In recent
years, major progress has been
made in understanding the mechanisms
of piRNA biogenesis and activity in
flies. 1

The Drosophila melanogaster genome 
carries a limited number of loci (~140) that
contain arrays of TE fragments and are 
most often bi-directionally transcribed 
in the germ cells to produce both sense 
and antisense piRNA precursor tran- 
scripts. In contrast, a single flamenco TE 
cluster is uni-directionally transcribed in 
the ovarian follicle cells surrounding the 
germ cells, produces antisense piRNA 
precursors exclusively, and is mainly 
involved in silencing of a specific class 
of retrotransposons called errantiviruses. 
piRNA precursor transcripts in the germ 
cells reach the nuage, a diffuse structure 
surrounding the nucleus where a number 
of components from the piRNA pathway 
accumulate. 2 In the nuage, the cleavage 
of piRNA precursor transcripts by the 
nuclease Zucchini and subsequent 3' 
shortening give rise to primary 23-28 nt
long piRNAs. 3,4 Antisense primary
piRNAs bound to the PIWI Argonaute 
Aubergine (Aub) can then enter the so-called
"ping-pong" amplification step by
pairing with sense piRNA precursors. 
This results in the slicing of precursors 
and generates secondary sense piRNAs, 
which, in turn, associate with the PIWI 
Argonaute AGO3 and guide the cleavage
of more antisense precursors. 5,6 Thus,
the ping-pong mechanism drives the 
amplification of a population of anti- 
sense piRNAs that overlap by 10 nt with 
sense piRNAs. Indeed, this 10 nt overlap 
"signature" reflects the piRNA-guided 
cleavage of piRNA precursors by PIWI 
Argonautes, which occurs between nucle- 
otide 10 and 11, relative to the 5' end of 
the guide piRNA. Because primary piRNAs
are biased toward a
5' uridine (1U), probably owing to
the nucleotide preferences of the Aubergine
and Piwi proteins, the ping-pong mechanism
amplifies a population of AGO3-associated
secondary piRNAs with a bias
for an adenine at position 10 (10A).
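
The 10 nt overlap signature described above can be checked by counting sense/antisense read pairs as a function of the overlap between their 5' ends. The following is a minimal sketch, not the analysis used in this study. It assumes that every read is summarized by the plus-strand coordinate of its 5' nucleotide (for an antisense read, its rightmost base), so that a pair overlapping by k nt satisfies antisense = sense + k - 1. The function name `overlap_histogram` and the toy coordinates are hypothetical.

```python
from collections import Counter


def overlap_histogram(sense_5p, antisense_5p, max_overlap=20):
    """Count sense/antisense read pairs for each 5'-end overlap length.

    sense_5p:     5' coordinates of sense reads (plus-strand numbering).
    antisense_5p: 5' coordinates of antisense reads, also in plus-strand
                  numbering (assumption: the rightmost base of the read).
    Returns a dict {overlap_in_nt: number_of_pairs}.
    """
    sense = Counter(sense_5p)
    antisense = Counter(antisense_5p)
    histogram = {}
    for k in range(1, max_overlap + 1):
        # A sense read starting at p and an antisense read whose 5' end
        # lies at p + k - 1 share exactly k nucleotides at their 5' ends.
        histogram[k] = sum(
            n * antisense.get(p + k - 1, 0) for p, n in sense.items()
        )
    return histogram


# Toy data: two sense reads at 100 paired with three antisense reads at 109.
hist = overlap_histogram([100, 100, 250], [109, 109, 109, 255])
print(max(hist, key=hist.get))
```

With real libraries, a peak at an overlap of 10 nt relative to neighbouring overlap lengths is the expected indication of ping-pong amplification.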

Although the ping-pong amplifica- 
tion provides an obvious mechanism of 
TE transcript degradation, TE silencing 
by Piwi-bound piRNAs may also operate 
at the transcriptional level (transcriptional gene silencing, TGS). The Piwi
Argonaute protein is found in both the
cytoplasm and the nucleus. 6,7 In ovarian follicle
cells, which are devoid of Aub and
AGO3, the slicer activity of the Piwi protein
is not required for TE silencing, whereas
a mutation impairing Piwi nuclear
localization abolishes TE silencing. 8,9
Moreover, in a follicle-derived cell line, 
Piwi-bound piRNAs target TE sequences 
dispersed in euchromatin, reduce their 
Pol II occupancy, and trigger the forma- 
tion of H3K9me3 repressive marks. 10 A 
recent study suggests that Piwi may simi- 
larly mediate TGS in some somatic adult 
tissues. 11

In contrast to the progress made on
piRNA biogenesis and silencing mechanisms,
what defines piRNA-producing loci and how
piRNA production by these loci is regulated
are not well understood. It is likely that the repetitive
nature of piRNA loci is an important
cis-regulatory feature for the production
of primary piRNAs. In addition, the deposition
of the H3K9me3 mark at TE clusters
by dSETDB1 has been implicated in piRNA
production, 12 and the HP1-family protein
Rhino binds to dual-strand clusters to
promote their transcription and piRNA
production. 13 However, the mechanisms
by which these factors are targeted to 
piRNA loci are still unknown. 

Our recent finding that piRNAs are
vectors of a paramutation in Drosophila 14
adds another layer of complexity by
indicating that maternally deposited
piRNAs can trigger the trans-generational
emergence of some piRNA-producing
loci. In previous studies, we had shown
that two P{lArB} transgenes inserted in
Telomeric Associated Sequences (TAS)
and containing the Adh and rosy genes
of D. melanogaster and a bacterial lacZ
gene repress germline expression of lacZ
reporter transgenes inserted at a distance,
through a homology-dependent silencing
mechanism called Trans-Silencing Effect
(TSE). 15,16 These telomeric P{lArB} insertions
(hereafter referred to as P-1152)
mimic natural P-elements, whose insertion
in TAS results in the production
of P-element-derived piRNAs and
establishment of maternally transmitted
P-element repression. 17-19 In addition to
P-1152, we found that T-1, a repeat cluster
of 7 P{lacW} transgenes containing
the white and lacZ genes and inserted
in the middle of chromosome arm 2R, 20
also produces piRNAs and triggers
strong TSE. 21 In contrast, other P{lacW}
clusters inserted at the exact same location,
including BX2, which has the same
number of P{lacW} repeats as T-1, did not
induce detectable TSE. TSE strongly correlates
with piRNA production, as small
RNA sequencing from P-1152 or T-1 but
not from BX2 ovaries revealed numerous
transgene-derived piRNAs. Strikingly,
when BX2 males were crossed with T-1
females (Fig. 1A), the female progeny
containing the BX2 chromosome
acquired strong TSE capacity (noted as
BX2*). This effect was observed without
inheritance of the T-1 chromosome from the
T-1 mother and was then stably inherited
over generations. Moreover, when BX2*
females were crossed with BX2 males
(Fig. 1B), the female progeny containing
the "naive" BX2 chromosome in turn
acquired strong TSE capacity (noted as
BX2* 2 ). Thus, the BX2 to BX2* transition
is a paramutation, previously defined
as an epigenetic interaction between two
alleles of a locus, through which one
allele induces a heritable modification
of the other allele without modifying
the DNA sequence. 22,23 Moreover, the
acquired and stable TSE capacities of the
BX2* and BX2* 2 lines correlated with
the production of a high level of BX2-derived
piRNAs in ovaries, and were
abolished in aubergine but not in Dicer-2
mutants. Altogether, these results imply
that piRNAs can act as a maternally
deposited signal that first triggers
and then maintains over generations the
production of piRNAs from a previously
inactive locus (Fig. 1). Interestingly,
recent work suggests that similar
mechanisms may account for the acquisition
of I-element repression capacity in
Drosophila strains devoid of functional
copies of this LINE-like element. 24

Sequencing of small
RNA libraries 14 prepared using an
Illumina set of RNA adapters (Illum)
revealed that piRNA abundance profiles
from T-1 and BX2* ovaries after two
generations (G2) are quite similar. This
similarity is apparent from the observed
degree of symmetry when either sense
(Fig. 2A) or antisense (Fig. 2B)
T-1_Illum and BX2*G2_Illum piRNA abundances were
plotted on the same maps. Accordingly,
abundances of sense as well as antisense
piRNAs from T-1_Illum and BX2*G2_Illum
showed a strong correlation (Fig. 2C and
D). Note that although the Spearman
correlation coefficient (based on ranks)
is lower in these
analyses, it is more appropriate and robust
than the Pearson correlation coefficient
(based on the linear relationship between the values)
when the data do not necessarily come
from a bivariate normal distribution,
as is likely the case for piRNA abundance
variables. Cloning biases impact
the small RNA libraries generated, 25
thereby altering quantitation and possibly
accounting for the strong correlations
between the T-1_Illum and BX2*G2_Illum
profiles of piRNA abundances. Indeed,
these cloning biases were reflected by the
lower correlations between the sense and
antisense BX2*G2_Illum profiles and the
sense and antisense BX2*G2_IdT profiles
obtained under the same genetic settings
but using another (IdT) set of RNA adapters
(Table 1, Pearson cor. 0.38 and 0.26,
Figure 1. Paramutation of the BX2 locus involves maternally inherited T-1 piRNAs. (A) Whereas the T-1 transgene cluster produces piRNAs (small red
dashes), the BX2 transgene cluster does not; these distinct properties are completely stable over generations. When T-1 females are crossed to BX2
males (G0), the female progeny (G1) that inherited the BX2 chromosome from fathers and T-1-derived piRNAs from the mother (but not the T-1 chromosome)
start to zygotically produce high levels of BX2-derived piRNAs. The inactive (blue) to active (red) state transition of the BX2 locus is noted
with an asterisk (BX2*). This so-called paramutation can be further maternally inherited in the next generations (Gn). (B) As in (A), maternal
inheritance of BX2*-derived piRNAs triggers the state transition of an inactive BX2 locus in G1, associated with zygotic production of piRNAs. This second-order
paramutation is noted as BX2* 2 and can be further maternally inherited in the next generations (Gn). The seven repeats of the P{lacW} transgene
in the T-1 and BX2 loci are represented by blue or red arrowheads, depending on the states of the loci (active in red, inactive in blue).
Figure 2. The profiles of piRNA abundances in T-1 and BX2* ovaries show strong correlations. The numbers of piRNAs (23-28 nt small RNA reads)
matching the sense strand (A) or the antisense strand (B) of P{lacW} in T-1 (blue bars) or BX2* ovaries (red bars) were plotted relative to the P{lacW}
nucleotide coordinates. The numbers of reads of individual sense (C) or antisense (D) piRNA sequences matching P{lacW} in T-1 (x-axes) or BX2* (y-axes)
ovaries were plotted in scatter plots. The Illum index indicates that the small RNA libraries were prepared using the same Illumina set of RNA adapters
(see ref. 14). The red line corresponds to the linear regression of the data. The Pearson and Spearman correlation coefficients were computed using the
cor.test function in R and the methods "pearson" or "spearman."



Spearman cor. 0.28 and 0.27). However,
these correlations were still highly significant
(P values < 2.2e-16 in both Pearson
and Spearman correlation tests). In addition,
sense and antisense profiles from
BX2*G2_IdT and T-1_Illum obtained using
different sets of RNA adapters during
library preparation also remain significantly
correlated (Table 1, Pearson cor.
0.28 and 0.27, Spearman cor. 0.29 and
0.27, all P values < 2.2e-16). In agreement
with a previous report, 26 these data
suggest that cloning biases in small RNA
libraries are not sufficient to explain
correlations between profiles of piRNA
abundances and that these profiles are in
part determined by the DNA sequence
of piRNA-producing loci. The analysis
of small RNA libraries all prepared
using the same IdT set of RNA adapters
further supports this conclusion, as both
sense and antisense piRNA abundance
profiles remain strongly correlated in the
BX2* line after 42 generations (Table 1,
BX2*G2_IdT vs. BX2*G42_IdT) as well as in a
BX2* 2 line after 36 generations (Table 1,
BX2*G2_IdT vs. BX2* 2 G36_IdT).
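
The comparison between two libraries described above amounts to turning each library into a vector of read counts indexed by transgene coordinate and then computing Pearson and Spearman coefficients between vectors. The sketch below illustrates this with plain Python so that the two statistics are explicit. It is an illustration under stated assumptions, not the R code used for Table 1: the helper names (`abundance_profile`, `average_ranks`, `pearson`, `spearman`) and the toy read positions are hypothetical, and no P values are computed. In practice, `cor.test` in R or `scipy.stats.pearsonr` and `scipy.stats.spearmanr` in Python would normally be used.

```python
from collections import Counter
from math import sqrt


def abundance_profile(five_prime_positions, length):
    """Vector of read counts indexed by the 1-based 5' coordinate."""
    counts = Counter(five_prime_positions)
    return [counts.get(i, 0) for i in range(1, length + 1)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy libraries mapped on a 10 nt sequence (hypothetical read positions).
lib_a = abundance_profile([3] * 8 + [5, 5, 8], 10)
lib_b = abundance_profile([3, 5, 5, 5, 8, 8], 10)
print(pearson(lib_a, lib_b), spearman(lib_a, lib_b))
```

Because the Spearman coefficient depends only on ranks, a few very abundant piRNA species weigh less on it than on the Pearson coefficient, which is one reason it suits count data of this kind.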

Interestingly, the P{lArB} transgenes
of the telomeric P-1152 locus and the
P{lacW} transgenes at the BX2 and T-1
loci in the middle of chromosome 2R
have some DNA sequences in common,
as evidenced by a dot matrix view of a
BLAST alignment of the two types of transgenes
(Fig. 3A). Namely, P{lArB} and
P{lacW} share the lacZ gene fused to the
3' UTR of hsp70 as well as a common
pBluescript-derived backbone, in either
direct or inverse orientation relative
to lacZ (Fig. 3). In contrast, the Adh
coding sequences fused to a 620 bp DNA
fragment of unknown origin and the
rosy marker gene are specific to P{lArB},
whereas the white marker gene is specific
to P{lacW}. This situation of two
piRNA-producing loci located at distinct
genomic positions but sharing partial
homology allowed us to test whether
DNA sequences locally impact the profiles
of piRNA abundances. To this end,
we computed the Spearman correlations
between all possible 500 nt profile segments
generated by the sense strand of
P{lArB} in P-1152 ovaries and all possible
500 nt profile segments generated by both
sense and antisense strands of P{lacW} in
T-1 ovaries (Fig. 3B; two times ~181 million
correlations were computed). We
repeated the same procedure with all possible
500 nt profile segments generated by
Table 1. Pearson and Spearman correlation coefficients of BX2* and T-1 profiles of piRNA abundances
Using the indicated sets of RNA adapters (Illum or IdT) to generate small RNA libraries, we had generated 10 sequencing data sets from BX2 and T-1
strains under various genetic settings (see text and ref. 14). Sense (s) and antisense (as) piRNA abundance profiles were handled as vectors of 11,690
variables indexed with the P{lacW} coordinates and taking as values the number of sequenced piRNAs mapped at these coordinates (most 5' nucleotide
location). Pearson (upper table) and Spearman (lower table) correlations between these vectors were then computed in pairwise combinations
using the R software. Significant correlations (P value < 2.2e-16) are bolded.



the antisense strand of P{lArB} (Fig. 3C).
As mentioned above, we chose the
Spearman correlation because it is more
stringent when the data do not necessarily
come from a bivariate normal distribution.
In addition, the 500 nt size of
profile segments was a good compromise
between the amount of information contained
in segments and the number of
pairwise correlations to be computed.
Correlation coefficients higher than 0.3
(P value < 2.2e-16) were then
plotted in a dot matrix. As a result, the correlation
dot matrices generated with the
sense (Fig. 3B) as well as the antisense
strand (Fig. 3C) of P{lArB} echoed the
dot matrix of the Blastn alignment between
P{lArB} and P{lacW}. Thus, the piRNA
abundance profiles generated by both
strands of the lacZ sequences at the
P-1152 or T-1 loci strongly correlated
with each other. Strikingly, the inverse
orientations of the pBluescript-derived
backbones in P{lArB} and P{lacW} were
captured in both correlation dot matrices
(Fig. 3B and C), indicating that the profiles
of piRNA abundances in the P-1152
and T-1 loci are locally determined by
DNA sequences but not by their relative
orientation.
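
The window-by-window comparison described above can be sketched as follows. This is a dependency-free illustration and not the in-house script used for Figure 3: every window of one abundance profile is compared with every window of another, and pairs whose Spearman coefficient exceeds a threshold are kept for plotting. The names `ranks`, `spearman` and `window_hits`, the toy profiles and the small window size are assumptions made for the example. No significance test is performed, and windows without any variation in read counts are skipped because their rank correlation is undefined.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman coefficient, or None if either vector is constant."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return sxy / sqrt(sxx * syy)


def window_hits(profile_a, profile_b, window=500, threshold=0.3):
    """Return (start_a, start_b, rho) for window pairs with rho > threshold.

    Start coordinates are 1-based positions of the windows' first nucleotide.
    """
    hits = []
    for i in range(len(profile_a) - window + 1):
        segment_a = profile_a[i:i + window]
        for j in range(len(profile_b) - window + 1):
            rho = spearman(segment_a, profile_b[j:j + window])
            if rho is not None and rho > threshold:
                hits.append((i + 1, j + 1, rho))
    return hits


# Toy profiles sharing one 6 nt stretch at different positions.
shared = [5, 0, 3, 9, 1, 7]
profile_a = [2, 1] + shared + [4]
profile_b = [1, 6, 2] + shared
print(window_hits(profile_a, profile_b, window=6, threshold=0.99))
```

Plotting the retained (start_a, start_b) pairs gives a dot matrix in which shared sequences appear as diagonals; a segment present in inverse orientation can be examined by reversing one profile before the scan. At the scale of the analysis reported here (hundreds of millions of window pairs), a vectorized or compiled implementation would be needed.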

The extended analysis of our piRNA 
sequencing data sets suggests that the 
relative abundance of piRNAs is locally 
determined in piRNA-producing loci. 
Further supporting this notion, it was 
possible to consistently remap transpo- 
sons and retrotransposons dispersed in 
the genome by scanning genome-wide 
piRNA profiles with a sliding window 
and computing correlations with refer- 
ence profiles for families of transposons 
and retrotransposons (data not shown). 



Several steps in piRNA biogenesis may be
sequence-biased, leading to sequence-dependent
profiles of piRNA abundances.
Cleavage of long RNA precursors
by the Zucchini RNase may involve
preferences for local RNA motifs and/or
local RNA secondary structures. In
addition to the established preference of
Aubergine and Piwi for piRNAs starting
with a 5' U, other sequence motifs may
be responsible for preferential piRNA
loading into these Argonautes. Finally,
thermodynamic features and target 
matches that, in turn, depend on piRNA 
sequences may influence the stability of 
piRNAs and, thus, their abundances in 
small RNA libraries. In any case, our data 
favor a model in which local sequence 
rather than long-distance chromosomal 
environment is a primary determinant of 
the abundance profiles of piRNAs. 
Figure 3. Sequences common to P{lacW}
and P{lArB} in T-1 and P-1152 loci, respectively,
generate highly similar profiles of piRNA
abundances. (A) Dot matrix view of a Blastn
alignment between P{lArB} and P{lacW}. In
these transgenes, the lacZ and pBluescript
sequences are 100% identical (black lines)
and in the same or inverse orientation
relative to the P-ends (5' and 3'), respectively.
The 620 bp sequence of the region
noted N in P{lArB} is unknown. (B) Spearman
correlation matrix between profiles of piRNA
abundances from the sense strand of P{lArB}
and profiles of piRNA abundances from
either strand of P{lacW}. All possible correlations
between 500 nt sliding windows were
computed using an in-house Python script
(available upon request) and the stats.spearmanr
function from the SciPy Python module.
Spearman correlation coefficients were
plotted relative to the 5' coordinates of the
windows in P{lArB} (x-axis) and P{lacW} (y-axis)
only if higher than 0.32: in green if lower
than 0.35, in orange when between 0.35 and
0.4, and in red if higher than 0.4. For clarity,
piRNA profiles from the indicated strands
are shown in insets as in Figure 2 for the
lacZ and pBluescript regions with high profile
similarities. The antisense pBluescript profile
from P{lacW} was reversed before plotting.
(C) The Spearman correlation matrix between
profiles of piRNA abundances from the antisense
strand of P{lArB} and profiles of piRNA
abundances from either strand of P{lacW}
was computed and displayed as in (B).